Moonshot AI Launches Kimi Linear: A Linear Attention Architecture That Runs 6 Times Faster, With the KDA Kernel Open-Sourced at the Same Time
Chinese AI company Moonshot AI has released the technical report for Kimi Linear, a hybrid linear attention architecture proposed as a replacement for full attention. According to the report, the architecture improves speed, memory efficiency, and long-context processing while significantly reducing KV cache usage, combining efficiency with strong performance. It has been described as a new starting point for attention mechanisms in the era of intelligent agents.
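To give a rough sense of why a linear attention layer needs less KV cache, here is a minimal sketch in plain Python. It implements generic, unnormalized linear attention (no softmax, no gating, no delta rule), not Moonshot AI's KDA kernel or the Kimi Linear architecture itself. The function names are hypothetical and chosen only for illustration. The recurrent form keeps a single fixed-size state matrix, while the cached form must keep every past key and value.

```python
def outer_add(S, k, v):
    """In place: S += k v^T, where S is a d_k x d_v matrix stored as a list of rows."""
    for i, ki in enumerate(k):
        row = S[i]
        for j, vj in enumerate(v):
            row[j] += ki * vj


def read(S, q):
    """Return q^T S, a vector of length d_v."""
    d_v = len(S[0])
    return [sum(q[i] * S[i][j] for i in range(len(q))) for j in range(d_v)]


def linear_attention_recurrent(qs, ks, vs):
    """Causal linear attention with a fixed-size state (illustrative, not KDA).

    The state S has d_k * d_v entries no matter how many tokens are processed.
    """
    d_k, d_v = len(ks[0]), len(vs[0])
    S = [[0.0] * d_v for _ in range(d_k)]
    outs = []
    for q, k, v in zip(qs, ks, vs):
        outer_add(S, k, v)          # fold the new token into the state
        outs.append(read(S, q))     # output for this step
    return outs, S


def linear_attention_cached(qs, ks, vs):
    """Same outputs, computed from a cache of all past keys and values.

    The cache grows with sequence length: t * (d_k + d_v) entries at step t.
    """
    outs = []
    for t, q in enumerate(qs):
        out = [0.0] * len(vs[0])
        for k, v in zip(ks[: t + 1], vs[: t + 1]):
            w = sum(a * b for a, b in zip(q, k))  # unnormalized score q . k
            for j, vj in enumerate(v):
                out[j] += w * vj
        outs.append(out)
    return outs
```

Both functions return the same per-token outputs, but only the second one needs memory that grows with context length. Softmax attention has no such exact fixed-size form, which is why full attention layers rely on a growing KV cache. The architecture in the report is described as a hybrid, so this sketch covers only the linear part of that picture.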